Abstract

We work in the space of m-by-n real matrices with the Frobenius inner
product. Consider the following problem:

Problem: Given an m-by-n real matrix A and a positive integer k, find
the m-by-n matrix with rank k that is closest to A.

I discuss a rank-preserving differential equation (d.e.) which solves this
problem. If X(t) is a solution of this d.e., then the distance between X(t)
and A decreases as t increases; this distance function is a Lyapunov
function for the d.e. If A has distinct positive singular values (which is a
generic condition) then this d.e. has only one stable equilibrium point. The
other equilibrium points are finite in number and unstable. In other words,
the basin of attraction of the stable equilibrium point on the manifold of
matrices with rank k consists of almost all matrices. This special
equilibrium point is the solution of the given problem. Usually constrained
optimization problems have many local minimums (most of which are
undesirable). So the constrained optimization problem considered here is
very special.

1. Introduction

We work in the space $\mathbb{R}^{m \times n}$ of m by n real matrices with
the "Frobenius" (or "euclidean") inner product. (In an appendix we review
the definition and elementary properties of this inner product.) We consider
the following problem:

Problem: Low rank approximation. Given a matrix $A$ in
$\mathbb{R}^{m \times n}$ and a positive integer $k$, find the matrix with
rank $k$ which is closest to $A$.

This problem is closely connected with the singular value decomposition
of matrices. If $A = UDV^T$ where $U$ is an $m \times m$ orthogonal matrix,
$V$ is an $n \times n$ orthogonal matrix, and
$D = \mathrm{Diag}(\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n)$ is
a diagonal matrix, then the product $UDV^T$ is called the singular value
decomposition of $A$. The diagonal entries of $D$ are the singular values
of $A$. It is well-known that every matrix has a singular value decomposition.
(See, for example, Horn and Johnson (1985) or Demmel (1997).)

The following result provides a solution of the low rank approximation 
problem: 

Proposition 1. Assume $m \ge n$ and that the rank of $A$ is greater than the
positive integer $k$. Let $A = UDV^T$ be the singular value decomposition of
$A$. Let $D' := \mathrm{Diag}(\sigma_1, \sigma_2, \ldots, \sigma_k, 0, \ldots, 0)$
and let $A' := UD'V^T$. Then $A'$ is the matrix with rank $k$ which most
closely approximates $A$ in the Frobenius norm.

This result is often called the Eckart-Young theorem. The result appeared
in Eckart and Young (1936). However, Stewart (1993) points out that it
was known earlier.

For a textbook proof of this proposition, see, for example, Horn and
Johnson (1985), Section 7.4, "Examples and applications of the singular value
decomposition". I present an alternative proof of this result in this paper. In
particular, I discuss a quasi-gradient differential equation which computes
the solution of the given problem. This proof provides more information
than other proofs. In particular, it shows that (generically) $A'$ is the unique
local minimum for the low rank approximation problem. In other words,
the basin of attraction of this matrix consists of almost all matrices on the
surface of matrices with rank $k$.

The proposition suggests a way to compute the solution of the low rank
approximation problem: Compute the singular value decomposition of $A$,
then compute the approximation $A'$. This procedure is obviously inefficient:
Why compute all the singular values of $A$ if we only need the largest ones for
the solution? We shall see that the differential equation is more "economical"
since its flow is on the manifold Rank($k$) of matrices with rank $k$. If $k$
is small then this manifold has dimension much smaller than the dimension
of $\mathbb{R}^{m \times n}$. I hope that this differential equation can be
used to design an efficient algorithm for low rank approximation.
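
The two-step procedure just described (decompose, then truncate) can be sketched numerically. The following NumPy snippet is only an illustration and is not part of the original text; the function name `best_rank_k`, the random test matrix and the choice $k = 2$ are assumptions made for the example. It forms $A'$ from the singular value decomposition and compares the Frobenius distance from $A$ to $A'$ with the square root of the sum of the squares of the discarded singular values.

```python
import numpy as np

def best_rank_k(A, k):
    """Return A' = U D' V^T, keeping only the k largest singular values of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Scale the first k left singular vectors by the singular values, then recombine.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))          # illustrative 5-by-3 matrix (assumption)
k = 2
A_k = best_rank_k(A, k)

s = np.linalg.svd(A, compute_uv=False)   # singular values in decreasing order
error = np.linalg.norm(A - A_k)          # Frobenius norm is the default for matrices
expected = np.sqrt(np.sum(s[k:] ** 2))   # norm of the discarded singular values

print("rank of A':", np.linalg.matrix_rank(A_k))
print("distance from A to A':", error, "expected:", expected)
```

Note that this sketch computes all the singular values of $A$, which is exactly the inefficiency pointed out above.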

Since the 1980's there has been significant work with flows on manifolds
of matrices. In particular, during the 1980's, there was considerable interest 
in continuous analogues of the QR algorithm for computing eigenvalues of 
matrices. The connection between the QR algorithm and the Toda flow 
was discovered by Symes about 1980. For more on this connection, see, for 
example, Symes(1980a,1980b,1982), Deift, Nanda and Tomei (1983), Nanda 
(1982,1985), Chu(1984), and Watkins (1984a,1984b). There are now also 
textbook descriptions of this connection: See, for example, Demmel (1997). 
For some other flows on matrices, see Chu(1986a,1986b), Chu and Driessel 
(1990), Helmke and Moore(1995), Driessel(2004), Driessel and Gerisch(2007) 
and the works cited in these references. 

Flows on manifolds of matrices are interesting not just because of their
connections with computation, but also for the insight they provide into the
geometry of the manifolds of interest. This is the main idea in Morse theory.
Let me say more about such geometric insights. Let $W$ be a real vector space
with an inner product $\langle \cdot, \cdot \rangle$. Let $S$ be a subset of
$W$ and let $f : S \to \mathbb{R}$ be a real-valued function on $S$. Then
$m \in S$ is a local minimum of $f$ on $S$ if there is a neighborhood $N$ of
$m$ such that $f(m)$ is a minimum of $f$ on $N$. I say that
$f : S \to \mathbb{R}$ has the unique local minimum property if $f$
is bounded below and has a unique local minimum. In this case the local
minimum is also the global minimum. Usually an optimization problem has
numerous (mostly undesirable) local minimums. An optimization problem
with the unique local minimum property is an especially nice optimization
problem.

Here are a few examples. Let $S$ be a convex set in $\mathbb{R}^n$ with the
euclidean inner product; let $a$ be a point in $\mathbb{R}^n$; let
$f_a : S \to \mathbb{R}$ be defined to be the square of the distance from
$s \in S$ to $a$: $f_a(s) := \langle a - s, a - s \rangle$; this function
has the unique local minimum property for all $a$. Let $S$ be a circle in the
euclidean plane and, for a point $a$ in $\mathbb{R}^2$, let
$f_a : S \to \mathbb{R}$ be the square of the distance from $s$ to $a$; this
function has the unique local minimum property unless $a$ is the center of
the circle.

Here is another example. Consider the following problem: 

Problem: Approximation with spectral constraint. Given an $n \times n$
symmetric matrix $A$ and real eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$,
find the symmetric matrix with these eigenvalues that is closest (in the
Frobenius norm) to $A$.

Chu and Driessel (1990) studied this problem. They showed (by means of
a "gradient" flow) that it satisfies the unique local minimum property if the
eigenvalues $\lambda_i$ are distinct.

Let Rank($k$) denote the set of matrices in $\mathbb{R}^{m \times n}$ with
rank $k$. Let $A$ be a matrix in $\mathbb{R}^{m \times n}$. Define the function
$f_A : \mathrm{Rank}(k) \to \mathbb{R}$ by
$f_A(X) := (1/2)\langle A - X, A - X \rangle$ where $\langle \cdot, \cdot \rangle$
is the Frobenius inner product. In this paper I show that if $A$ has distinct
positive singular values then the function $f_A$ has a finite number of
critical points only one of which is a local minimum and hence $f_A$ has the
unique local minimum property.

Remark: Helmke and Shayman (1995), in Theorem 4.2(ii), say that $f_A$
has a finite number of critical points if and only if $m = n$ and $A$ has $m$
distinct nonzero singular values. The results that I present here show that
the condition $m = n$ is not necessary.

Contents summary: In the section with the title "Setting up the differential
equation", I review the differential geometry associated with the rank
approximation problem. (This material appears in Helmke and Moore (1995)
and in Helmke and Shayman (1995). I include it to make this paper more
self-contained.) I also describe the quasi-projection operator associated with
this problem. (For more on such operators see Driessel (2004).) In the section
with title "Properties of the differential equation", I show that this
differential equation has the convergence properties asserted above. (This
differential equation appears in Helmke and Moore (1995) and Helmke and
Shayman (1995) but they derive it in a more complicated way than I do.
Their discussion of its equilibrium points is not very clear. They do not
classify the equilibrium points. They do not discuss basins of attraction.)

The only prerequisites for understanding (almost all) of this paper are
a basic knowledge of differential equations (see, for example, Hirsch and
Smale (1974)) and basic differential geometry (see, for example, Thorpe (1979)).






KENNETH R. DRIESSEL 



2. Setting up the differential equation 

In this section we view the sets Rank($k$) of matrices with fixed ranks
$k$ as parameterized surfaces in the space $\mathbb{R}^{m \times n}$. We
compute the tangent spaces of these constant rank surfaces. Then we define a
"quasi-projection" map which can be used to transform vector fields in
$\mathbb{R}^{m \times n}$ into vector fields tangent to these surfaces. We
also define an objective function associated with the constrained
optimization problem of interest and we compute its gradient. Finally, we use
the quasi-projection map to convert this gradient vector field into one which
is tangent to the constant rank surfaces.

Let $\mathrm{Gl}(m)$ denote the general linear group of m by m, invertible,
real matrices. Recall (see, for example, Birkhoff and MacLane (1953)) that two
matrices $X$ and $Y$ in $\mathbb{R}^{m \times n}$ are equivalent if there exist
matrices $G \in \mathrm{Gl}(m)$ and $H \in \mathrm{Gl}(n)$ such that
$Y = GXH^{-1}$. Also recall that every matrix $M$ in $\mathbb{R}^{m \times n}$
is equivalent to a diagonal matrix $D$ with ones and zeros on its main
diagonal. The number of ones equals the rank of $M$.

We can use the groups $\mathrm{Gl}(m)$ and $\mathrm{Gl}(n)$ to "parameterize"
the matrices with rank $k$ as follows. We use the following group action:

$$\mathrm{Gl}(m) \times \mathrm{Gl}(n) \times \mathbb{R}^{m \times n} \to \mathbb{R}^{m \times n} : (G, H, X) \mapsto GXH^{-1}.$$

For $M \in \mathbb{R}^{m \times n}$, we use Orbit($M$) to denote the orbit of
$M$ under this group action; in symbols,

$$\mathrm{Orbit}(M) := \{GMH^{-1} : G \in \mathrm{Gl}(m),\ H \in \mathrm{Gl}(n)\}.$$

Let

$$\mathrm{Rank}(k) := \{X \in \mathbb{R}^{m \times n} : \mathrm{Rank}(X) = k\}.$$

The following proposition summarizes the comments given above. 

Proposition 2. Let $k$ be a positive integer and let $K$ be any matrix with
rank $k$. Then the set of matrices with rank $k$ is the same as the orbit of
$K$ under the given group action; in symbols,

$$\mathrm{Rank}(k) = \mathrm{Orbit}(K).$$

For $B \in \mathrm{Orbit}(K)$, I use $\mathrm{Tan.Orbit}(K).B$ to denote the
space tangent to $\mathrm{Orbit}(K)$ at $B$.

Proposition 3. Let $K$ and $B$ be matrices in $\mathbb{R}^{m \times n}$ with
$B$ on the orbit of $K$. Then the space tangent to the orbit of $K$ at $B$ is
given by

$$\mathrm{Tan.Orbit}(K).B = \{XB + BY : X \in \mathbb{R}^{m \times m},\ Y \in \mathbb{R}^{n \times n}\}.$$

The dimension of this tangent space is $k^2 + k(m - k) + k(n - k)$ where
$k := \mathrm{Rank}(K)$.

Proof. We simply compute the derivative of the parameterizing map. We
have (where $D$ denotes the derivative operator)

$$D((G,H) \mapsto GBH^{-1}).(I,I).(X,Y)$$
$$= (D(G \mapsto GB).I.X) \cdot (I^{-1}) + (IB) \cdot (D(H \mapsto H^{-1}).I.Y)$$
$$= XB - BY$$

since

$$D(H \mapsto H^{-1}).C.W = -C^{-1}WC^{-1}.$$

Since $Y$ ranges over all of $\mathbb{R}^{n \times n}$, replacing $Y$ by $-Y$
shows that the set of these derivatives is $\{XB + BY\}$. (I sometimes use
dots for function evaluation in order to reduce the number of parentheses. I
also use association to the left.)

Since the orbit is a homogeneous space, it looks the same at all its points.
Consequently, we can compute the dimension of the tangent space at any
convenient point. For example, we can do the computation at the diagonal
matrix with exactly $k$ ones on its diagonal and zeros elsewhere; in particular,
we can take $B := \mathrm{Diag}(1^{k}, 0^{(l-k)})$ (that is, $k$ ones followed
by $l - k$ zeros) where $l$ is the minimum of $m$ and $n$. □
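
The dimension count can be checked numerically. The sketch below is an illustration only (the function name `tangent_space_dimension` and the sizes $m = 4$, $n = 3$, $k = 2$ are assumptions): it applies the map $(X, Y) \mapsto XB + BY$ to every pair of basis matrices, stacks the images, and compares the rank of the result with $k^2 + k(m-k) + k(n-k)$.

```python
import numpy as np

def tangent_space_dimension(B):
    """Rank of the linear map (X, Y) -> X B + B Y, computed by brute force."""
    m, n = B.shape
    images = []
    for p in range(m):                  # images of the basis vectors (E^{pq}, 0)
        for q in range(m):
            X = np.zeros((m, m))
            X[p, q] = 1.0
            images.append((X @ B).ravel())
    for p in range(n):                  # images of the basis vectors (0, E^{pq})
        for q in range(n):
            Y = np.zeros((n, n))
            Y[p, q] = 1.0
            images.append((B @ Y).ravel())
    # The span of the images is the tangent space; its dimension is the rank.
    return np.linalg.matrix_rank(np.array(images))

m, n, k = 4, 3, 2
rng = np.random.default_rng(1)
B = rng.standard_normal((m, k)) @ rng.standard_normal((k, n))  # generic rank-k matrix

dim_numeric = tangent_space_dimension(B)
dim_formula = k**2 + k * (m - k) + k * (n - k)
print(dim_numeric, dim_formula)
```

The rank is computed with the default tolerance of `numpy.linalg.matrix_rank`, so the comparison is reliable only when the nonzero singular values of $B$ are not tiny.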

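For $B$ on the orbit of $K$, we consider the following linear map:

$$L_B : \mathbb{R}^{m \times m} \times \mathbb{R}^{n \times n} \to \mathbb{R}^{m \times n} : (X, Y) \mapsto XB + BY.$$

Note that the range of this map equals the space tangent to the orbit of $K$
at $B$. We compute the adjoint $L_B^*$ of this map.

Proposition 4. Adjoint of the tangent space map. Let $B$ be on the
orbit of $K$. Then the adjoint $L_B^*$ of the linear map $L_B$ is the map

$$L_B^* : \mathbb{R}^{m \times n} \to \mathbb{R}^{m \times m} \times \mathbb{R}^{n \times n} : Z \mapsto (ZB^T, B^TZ).$$

Proof. We have

$$\langle L_B(X,Y), Z \rangle = \langle XB + BY, Z \rangle = \langle X, ZB^T \rangle + \langle Y, B^TZ \rangle = \langle (X,Y), (ZB^T, B^TZ) \rangle.$$

Here we have used the "product" inner product on the space
$\mathbb{R}^{m \times m} \times \mathbb{R}^{n \times n}$ which is defined in
terms of the Frobenius inner product by:

$$\langle (X_1,Y_1), (X_2,Y_2) \rangle := \langle X_1, X_2 \rangle + \langle Y_1, Y_2 \rangle.$$

□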
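
The adjoint identity $\langle XB + BY, Z \rangle = \langle X, ZB^T \rangle + \langle Y, B^TZ \rangle$ used in this proof is easy to test on random matrices. The following sketch is illustrative; the helper names `L`, `L_adjoint` and `frob` and the matrix sizes are assumptions introduced for the example.

```python
import numpy as np

def L(B, X, Y):
    """Tangent space map L_B(X, Y) = X B + B Y."""
    return X @ B + B @ Y

def L_adjoint(B, Z):
    """Adjoint map L_B^*(Z) = (Z B^T, B^T Z)."""
    return Z @ B.T, B.T @ Z

def frob(P, Q):
    """Frobenius inner product <P, Q> = Trace(P Q^T), as a sum of entrywise products."""
    return np.sum(P * Q)

m, n = 4, 3
rng = np.random.default_rng(2)
B = rng.standard_normal((m, n))
X = rng.standard_normal((m, m))
Y = rng.standard_normal((n, n))
Z = rng.standard_normal((m, n))

lhs = frob(L(B, X, Y), Z)               # <L_B(X, Y), Z>
ZBt, BtZ = L_adjoint(B, Z)
rhs = frob(X, ZBt) + frob(Y, BtZ)       # <(X, Y), L_B^*(Z)> in the product inner product
print(lhs, rhs)
```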

I call the composition $L_B \circ L_B^*$ a "quasi-projection" map. We can use
this operator to transform vector fields on $\mathbb{R}^{m \times n}$ into ones
which are tangent to the orbits of interest. For more on the use of
quasi-projections see Driessel (2004).

For $A$ in $\mathbb{R}^{m \times n}$, we define the objective function
$f := f_A$ determined by $A$ as the following function:

$$\mathbb{R}^{m \times n} \to \mathbb{R} : X \mapsto (1/2)\langle X - A, X - A \rangle.$$

In other words, $f_A(X)$ is one half the square of the distance from $X$ to $A$.

Proposition 5. Gradient of the objective function. Let $A$ and $B$ be
matrices in $\mathbb{R}^{m \times n}$. The gradient of the objective function
$f_A$ at $B$ is $B - A$; in symbols,

$$\nabla f_A(B) = B - A.$$

Proof. We simply compute the derivative of $f := f_A$. For $X$ in
$\mathbb{R}^{m \times n}$, we have

$$Df.B.X = D(X \mapsto (1/2)\langle X - A, X - A \rangle).B.X \tag{1}$$
$$= \langle B - A, D(X \mapsto X - A).B.X \rangle = \langle B - A, X \rangle. \tag{2}$$

Since the gradient is characterized by $Df.B.X = \langle \nabla f(B), X \rangle$
for all $X$, the result follows.

□

We have a gradient vector field on $\mathbb{R}^{m \times n}$ defined by
$X \mapsto \nabla f_A(X) = X - A$. But this vector field is generally not
tangent to the constant rank surfaces. In other words, the corresponding
differential equation $X' = \nabla f_A(X)$ does not preserve rank. We want to
adjust the gradient vector field so that the corresponding vector field does
preserve rank. We can use the quasi-projection map to do so.

We now compute the quasi-projection of the negative gradient onto the
tangent space. For $B$ on the orbit of $K$, we have

$$(L_B \circ L_B^*)(-\nabla f_A(B)) = L_B((A - B)B^T, B^T(A - B))$$
$$= (A - B)B^TB + BB^T(A - B).$$

In the next section we shall use this formula to define a vector field on the
space $\mathbb{R}^{m \times n}$. We shall then see that the corresponding
differential equation provides a solution of the constrained optimization
problem of interest.

3. Properties of the Differential Equation

In the last section we saw how to adjust the gradient vector field determined
by the objective function so that the resulting vector field is tangent
to constant rank submanifolds. We now use that quasi-gradient vector field
to define a differential equation.

Using the results of the last section we define the vector field $F$ on
$\mathbb{R}^{m \times n}$ as follows:

$$F(X) := (L_X \circ L_X^*)(A - X) = (A - X)X^TX + XX^T(A - X).$$

We consider the differential equation associated with this vector field:

$$X' = F(X). \tag{*}$$

We shall see later that the solutions of this differential equation are defined
for all time. In particular we shall see that the solutions do not blow up.
We shall also see that they converge.
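
To get a feeling for the flow, one can integrate the differential equation numerically. The sketch below is an illustration under several assumptions that are not in the text: it uses the classical fourth-order Runge-Kutta method with a fixed step, takes $A$ to be the 4 by 3 diagonal matrix with singular values 3, 2, 1, and starts from a rank 2 matrix chosen fairly close to $A$ so that the run is short. It is not an efficient algorithm for low rank approximation; it only shows the distance to $A$ decreasing and the solution approaching the truncated singular value decomposition.

```python
import numpy as np

def F(X, A):
    """Vector field F(X) = (A - X) X^T X + X X^T (A - X)."""
    R = A - X
    return R @ X.T @ X + X @ X.T @ R

def rk4_step(X, A, h):
    """One classical fourth-order Runge-Kutta step for X' = F(X)."""
    k1 = F(X, A)
    k2 = F(X + 0.5 * h * k1, A)
    k3 = F(X + 0.5 * h * k2, A)
    k4 = F(X + h * k3, A)
    return X + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def truncated_svd(M, k):
    """Keep the k largest singular values of M."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

m, n, k = 4, 3, 2
A = np.zeros((m, n))
A[np.arange(n), np.arange(n)] = [3.0, 2.0, 1.0]    # singular values 3 > 2 > 1

# Rank-k starting point: truncate a perturbation of A (an arbitrary choice).
rng = np.random.default_rng(3)
N = rng.standard_normal((m, n))
N *= 0.3 / np.linalg.norm(N)
X = truncated_svd(A + N, k)

distances = [np.linalg.norm(A - X)]
for _ in range(8000):                               # up to t = 40 with step 0.005
    X = rk4_step(X, A, 0.005)
    distances.append(np.linalg.norm(A - X))

target = truncated_svd(A, k)
print("distance to A at start and end:", distances[0], distances[-1])
print("distance to truncated SVD     :", np.linalg.norm(X - target))
```

The step size and the number of steps are hand-picked for this small example; a different $A$ or starting point may need different values.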

Note that this differential equation is clearly rank preserving since the
vector $F(X)$ is tangent at $X$ to the manifold of matrices having the same
rank as $X$. The following proposition provides a more concrete argument. (In
the following analysis, we shall only use the fact that the differential
equation preserves rank. We shall not use the other assertions of this result.)

Proposition 6. Rank preserving. Let $X(t)$ be the solution of the initial
value problem

$$X' = F(X), \quad X(0) = K.$$

Let $G(t)$ and $H(t)$ be solutions of the following initial value problems:

$$G' = (A - X)X^TG, \quad G(0) = I,$$
$$H' = -X^T(A - X)H, \quad H(0) = I.$$

Then $X(t) = G(t)KH(t)^{-1}$ and the rank is invariant.

Remark: The differential equation for $G$ is determined by a tangent vector
field on $\mathrm{Gl}(m)$ and the differential equation for $H$ is determined
by a tangent vector field on $\mathrm{Gl}(n)$. Note that the expressions
$(A - X)X^T$ and $X^T(A - X)$ appear in the expression defining the vector
field $F(X)$.

Proof. Let $Z(t) := G(t)^{-1}X(t)H(t)$. Note $Z(0) = K$ and

$$Z' = -G^{-1}G'G^{-1}XH + G^{-1}X'H + G^{-1}XH'$$
$$= -G^{-1}(A - X)X^TGG^{-1}XH + G^{-1}((A - X)X^TX + XX^T(A - X))H - G^{-1}XX^T(A - X)H$$
$$= 0.$$

Hence $Z(t) = K$ for all $t$. □
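
The factorization $X(t) = G(t)KH(t)^{-1}$ can be observed numerically by integrating the three equations together. The sketch below is an illustration only; the Runge-Kutta integrator, the particular $A$ and $K$, the step size and the final time are assumptions, and agreement holds only up to discretization error.

```python
import numpy as np

def rhs(state, A):
    """Right-hand sides of X' = F(X), G' = (A - X) X^T G, H' = -X^T (A - X) H."""
    X, G, H = state
    R = A - X
    return (R @ X.T @ X + X @ X.T @ R, R @ X.T @ G, -X.T @ R @ H)

def shifted(state, direction, c):
    """Return state + c * direction, componentwise."""
    return tuple(s + c * d for s, d in zip(state, direction))

def rk4_step(state, A, h):
    """One classical fourth-order Runge-Kutta step for the coupled system."""
    k1 = rhs(state, A)
    k2 = rhs(shifted(state, k1, 0.5 * h), A)
    k3 = rhs(shifted(state, k2, 0.5 * h), A)
    k4 = rhs(shifted(state, k3, h), A)
    return tuple(s + (h / 6.0) * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

m, n, k = 4, 3, 2
A = np.zeros((m, n))
A[np.arange(n), np.arange(n)] = [3.0, 2.0, 1.0]

# Rank-k initial matrix K: truncated SVD of a perturbation of A (arbitrary choice).
rng = np.random.default_rng(4)
N = rng.standard_normal((m, n))
N *= 0.3 / np.linalg.norm(N)
U, s, Vt = np.linalg.svd(A + N, full_matrices=False)
K = (U[:, :k] * s[:k]) @ Vt[:k, :]

state = (K.copy(), np.eye(m), np.eye(n))
for _ in range(1000):                    # integrate up to t = 1 with step 0.001
    state = rk4_step(state, A, 0.001)
X, G, H = state

reconstructed = G @ K @ np.linalg.inv(H)
print("norm of X - G K H^{-1}:", np.linalg.norm(X - reconstructed))
print("singular values of X  :", np.linalg.svd(X, compute_uv=False))
```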

The following proposition says that for any solution $X(t)$ of the differential
equation (*), the distance between $X(t)$ and $A$ decreases.

Proposition 7. Lyapunov function. The objective function $f_A$ is a
Lyapunov function for the differential equation (*).

Proof. Let $X(t)$ be any solution of (*). To simplify the notation, let
$f := f_A$ and $L := L_X$. We have

$$(d/dt) f(X(t)) = (1/2)(d/dt)\langle X - A, X - A \rangle$$
$$= \langle X - A, X' \rangle$$
$$= \langle X - A, -(L \circ L^*)(X - A) \rangle$$
$$= -\langle L^*(X - A), L^*(X - A) \rangle \le 0.$$

□
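
The identity at the heart of this proof, $\langle X - A, F(X) \rangle = -\langle L_X^*(X - A), L_X^*(X - A) \rangle$, is algebraic and holds at every matrix $X$, so it can be tested at a single random point. In the sketch below (an illustration; the function names and the random matrices are assumptions) the right-hand side is written out as $-(\|(X - A)X^T\|^2 + \|X^T(X - A)\|^2)$.

```python
import numpy as np

def F(X, A):
    """Vector field F(X) = (A - X) X^T X + X X^T (A - X)."""
    R = A - X
    return R @ X.T @ X + X @ X.T @ R

def lyapunov_rate(X, A):
    """-|L_X^*(X - A)|^2 = -(|(X - A) X^T|^2 + |X^T (X - A)|^2), Frobenius norms."""
    D = X - A
    return -(np.linalg.norm(D @ X.T) ** 2 + np.linalg.norm(X.T @ D) ** 2)

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 3))
X = rng.standard_normal((4, 3))

# d/dt f(X(t)) = <X - A, X'> with X' = F(X), evaluated at the point X.
directional = np.sum((X - A) * F(X, A))
closed_form = lyapunov_rate(X, A)
print(directional, closed_form)
```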

Proposition 8. The solutions of the differential equation (*) are defined
for all positive times.

Proof. Let $X(t)$ be a solution of the differential equation. By the last
proposition the distance between $A$ and $X(t)$ decreases as $t$ increases.
Hence the solution remains in the closed ball with radius $\|A - X(0)\|$
centered at $A$. Since this ball is compact the solution cannot blow up. □

Proposition 9. Equilibrium conditions. Let $E$ be an element of
$\mathbb{R}^{m \times n}$. Then the following conditions are equivalent:

(i) $E$ is an equilibrium point of the differential equation (*).

(ii) $E$ satisfies the equations

$$AE^T = EE^T, \quad E^TA = E^TE.$$

(iii) $A - E$ is orthogonal to the space tangent to the orbit of $E$ at $E$.

(iv) $E$ is a critical point of the objective function $f_A$.

Proof. (i) implies (ii): Let $E$ be an equilibrium point of (*). Then (by the
proof of the Lyapunov proposition)

$$0 = L_E^*(A - E) = ((A - E)E^T, E^T(A - E)).$$

Hence $(A - E)E^T = 0$ and $E^T(A - E) = 0$.

(ii) implies (i): Assume that $E$ satisfies the given equations. Then we have
$L_E^*(A - E) = 0$ and hence $(L_E \circ L_E^*)(A - E) = 0$.

(ii) implies (iii): Assume that $E$ satisfies the given equations. Then for
any $X$ in $\mathbb{R}^{m \times m}$ and $Y$ in $\mathbb{R}^{n \times n}$, we have

$$\langle A - E, XE + EY \rangle = \langle (A - E)E^T, X \rangle + \langle E^T(A - E), Y \rangle = 0.$$

(iii) implies (ii): Assume that $A - E$ is orthogonal to the tangent space.
Then, for all $X$ in $\mathbb{R}^{m \times m}$ and $Y$ in $\mathbb{R}^{n \times n}$,
we have

$$0 = \langle A - E, XE + EY \rangle = \langle (A - E)E^T, X \rangle + \langle E^T(A - E), Y \rangle.$$

It follows that $E$ satisfies the given equations.

(iii) is equivalent to (iv): This equivalence is obvious. □

Proposition 10. Quasi-commuting relations. Let $E$ be an equilibrium
point of the differential equation (*). Then

• The matrix $E$ satisfies the equations

$$AE^T = EA^T, \quad A^TE = E^TA.$$

• The matrix $E$ satisfies the equations

$$A^TAE^T = E^TAA^T, \quad AA^TE = EA^TA.$$

I call the two equations which appear in the first conclusion of this
proposition the "quasi-commuting" relations for $E$.

Proof. We have $AE^T = EE^T$ and $E^TA = E^TE$ from the proposition
characterizing the equilibrium points. To get the quasi-commuting relations we
simply use the symmetry of $EE^T$ and $E^TE$.

To get the other relations we simply apply the quasi-commuting relations
repeatedly. In particular, we have

• $A^T(AE^T) = A^T(EA^T) = (A^TE)A^T = (E^TA)A^T$ and

• $A(A^TE) = A(E^TA) = (AE^T)A = (EA^T)A$.

□
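
The equilibrium conditions and the quasi-commuting relations can be checked on a concrete matrix. The sketch below is illustrative and rests on an assumption stated here: it builds a candidate equilibrium $E$ from the singular value decomposition of a random $A$ by keeping the first and third singular values and replacing the second by zero, and then tests each relation with `numpy.allclose`.

```python
import numpy as np

def F(X, A):
    """Vector field F(X) = (A - X) X^T X + X X^T (A - X)."""
    R = A - X
    return R @ X.T @ X + X @ X.T @ R

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 3))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Candidate equilibrium (assumption): keep sigma_1 and sigma_3, drop sigma_2.
keep = np.array([1.0, 0.0, 1.0])
E = (U * (s * keep)) @ Vt

checks = {
    "F(E) = 0": np.allclose(F(E, A), 0.0),
    "A E^T = E E^T": np.allclose(A @ E.T, E @ E.T),
    "E^T A = E^T E": np.allclose(E.T @ A, E.T @ E),
    "A E^T = E A^T": np.allclose(A @ E.T, E @ A.T),
    "A^T E = E^T A": np.allclose(A.T @ E, E.T @ A),
    "A^T A E^T = E^T A A^T": np.allclose(A.T @ A @ E.T, E.T @ A @ A.T),
    "A A^T E = E A^T A": np.allclose(A @ A.T @ E, E @ A.T @ A),
}
for name, ok in checks.items():
    print(name, ok)
```

The example deliberately uses an equilibrium that is not the best rank 2 approximation, since the relations hold at every equilibrium point.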

In the following proof and example, I use $E^{pq}$ to denote the $m \times n$
matrix with a one in position $pq$ and zeros elsewhere:
$E^{pq}_{ij} := \delta(i,p)\delta(j,q)$. Note that these matrices form a basis
of the vector space $\mathbb{R}^{m \times n}$.

Proposition 11. Stability of the equilibrium points. If the matrix $A$
has distinct positive singular values, then the differential equation (*) has
isolated equilibrium points only one of which is stable. It follows that the
solutions of the differential equation converge and that almost all of them
converge to the stable equilibrium point.

Remark: Note that the set of matrices with distinct positive singular
values is a generic (that is, an open and dense) subset of
$\mathbb{R}^{m \times n}$.

Proof. We do the case $m \ge n$. The proof in the case $m < n$ is essentially
the same.

We have been working in a coordinate-free way until now. We now choose
a convenient coordinate system in which to do calculations. In particular,
we choose the basis so that $A$ is a diagonal matrix of ordered singular values:

$$A = \mathrm{Diag}(\sigma_1 > \sigma_2 > \cdots > \sigma_n).$$

Claim: If a matrix $E$ is an equilibrium point of the differential equation
then $E$ is a diagonal matrix.

Recall that $E$ must satisfy $AA^TE = EA^TA$. We simply calculate these
matrix products and compare entries. We have $(AA^TE)_{ij} = \sigma_i^2E_{ij}$
if $i \le n$ and $(AA^TE)_{ij} = 0$ if $i > n$. We also have
$(EA^TA)_{ij} = E_{ij}\sigma_j^2$. We conclude that if $i \le n$ and $i \ne j$
then $\sigma_i^2E_{ij} = E_{ij}\sigma_j^2$ and hence $E_{ij} = 0$ since
$\sigma_i^2 \ne \sigma_j^2$. If $i > n$ and $i \ne j$ then
$0 = E_{ij}\sigma_j^2$ and hence $E_{ij} = 0$ since $\sigma_j^2 \ne 0$. Thus
all the off-diagonal entries of $E$ must be zero.

Claim: Let $E := \mathrm{Diag}(e_1, \ldots, e_n)$ be an equilibrium point of
the differential equation. Then, for $i = 1, 2, \ldots, n$, either
$e_i = \sigma_i$ or $e_i = 0$.

Since the vector field vanishes at $E$, we have

$$0 = (\sigma_i - e_i)e_i^2 + e_i^2(\sigma_i - e_i) = 2(\sigma_i - e_i)e_i^2.$$

Claim: The solutions of the differential equation converge.

From the last claim we see that there are a finite number of equilibrium
points. A gradient flow confined to a compact set with a finite number
of equilibrium points must converge. See, for example, Palis and de
Melo (1982).

We now turn to the classification of the equilibrium points. Let
$E = \mathrm{Diag}(e_1, \ldots, e_n)$ be an equilibrium point. We compute the
linearization of the differential equation at $E$: We get the linear
differential equation

$$X' = D.F.E.X = (A - E)X^TE + EX^T(A - E) - XE^TE - EE^TX.$$

We regard $D.F.E$ as a linear map on the space tangent to the orbit of $E$
at $E$. (By the way, it is easy to check that this map is self-adjoint.) The
nature of the equilibrium is determined by this linear map. In particular, the
equilibrium point $E$ is stable if the eigenvalues of this map are all negative.
If this map has a positive eigenvalue then the equilibrium point is unstable.
We want to see that exactly one of the equilibrium points has all eigenvalues
negative (a stable situation) and that all of the other equilibrium points have
at least one positive eigenvalue (an unstable situation).

We have

$$(D.F.E.X)_{ii} = c_ix_{ii}$$

where $c_i := 2e_i(\sigma_i - 2e_i)$. For $i \ne j$, we have

$$(D.F.E.X)_{ij} = b_{ij}x_{ji} - a_{ij}x_{ij}$$

where $a_{ij} := e_i^2 + e_j^2$ and
$b_{ij} := (\sigma_i - e_i)e_j + e_i(\sigma_j - e_j)$. (For $i > n$ we set
$e_i := 0$ and $\sigma_i := 0$, so that $b_{ij} = 0$ and the entry $x_{ji}$,
which does not exist, does not occur.)

Since the $ij$ entry of $D.F.E.X$ involves only the $ij$ and $ji$ entry of $X$,
we temporarily restrict our attention to 2 by 2 matrices.

We need to find the eigenvalues of the map:

$$\begin{pmatrix} x_{ij} \\ x_{ji} \end{pmatrix} \mapsto M \begin{pmatrix} x_{ij} \\ x_{ji} \end{pmatrix},$$

where

$$M := \begin{pmatrix} -a_{ij} & b_{ij} \\ b_{ij} & -a_{ij} \end{pmatrix}.$$

The matrix $M$ has the following form:

$$\begin{pmatrix} -a & b \\ b & -a \end{pmatrix}.$$

This matrix has eigenvalues $-a \pm b$. In particular,

$$\begin{pmatrix} -a & b \\ b & -a \end{pmatrix}\begin{pmatrix} 1 \\ 1 \end{pmatrix} = (-a + b)\begin{pmatrix} 1 \\ 1 \end{pmatrix}$$

and

$$\begin{pmatrix} -a & b \\ b & -a \end{pmatrix}\begin{pmatrix} 1 \\ -1 \end{pmatrix} = (-a - b)\begin{pmatrix} 1 \\ -1 \end{pmatrix}.$$

Hence the eigenvalues of $M$ are

$$\lambda_1 := -(e_i + e_j)^2 + \sigma_ie_j + e_i\sigma_j$$

and

$$\lambda_2 := -(e_i - e_j)^2 - \sigma_ie_j - e_i\sigma_j.$$

Note that $\lambda_2 \le 0$ for all values of $e_i$, $e_j$, $\sigma_i$ and
$\sigma_j$ since these values are always nonnegative.

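Claim: The diagonal matrix
$E^* := \mathrm{Diag}(\sigma_1, \ldots, \sigma_k, 0, \ldots, 0)$, where $k$ is
the rank of the initial matrix $K$, is a stable equilibrium point.

We want to see that all the eigenvalues associated with this equilibrium
point are negative. Note that the set
$\{E^{pq} : 1 \le p \le k \text{ or } 1 \le q \le k\}$ is a
basis of the space $\mathrm{Tan.Orbit}(E^*).E^*$ tangent to the orbit of $E^*$
at $E^*$.

If $1 \le i \le k$ and $1 \le j \le k$ and $i \ne j$ then
$a_{ij} = \sigma_i^2 + \sigma_j^2$ and $b_{ij} = 0$; if $1 \le i \le k$
and $k < j \le n$ then $a_{ij} = a_{ji} = \sigma_i^2$ and
$b_{ij} = b_{ji} = \sigma_i\sigma_j$; if $n < i \le m$ and $1 \le j \le k$
then $a_{ij} = \sigma_j^2$ and $b_{ij} = 0$. For $i = 1, \ldots, k$,
$c_i = -2\sigma_i^2$. The eigenvalue-vector pairs of $D.F.E^*$ are

• $(-(\sigma_i^2 + \sigma_j^2), E^{ij})$ for $1 \le i \le k$, $1 \le j \le k$, $i \ne j$,

• $(-\sigma_i(\sigma_i - \sigma_j), E^{ij} + E^{ji})$ and
  $(-\sigma_i(\sigma_i + \sigma_j), E^{ij} - E^{ji})$ for $1 \le i \le k$ and $k < j \le n$,

• $(-\sigma_j^2, E^{ij})$ for $n < i \le m$ and $1 \le j \le k$,

• $(-2\sigma_i^2, E^{ii})$ for $1 \le i \le k$.

Note all these eigenvalues are negative, since $\sigma_i > \sigma_j > 0$
whenever $i \le k < j \le n$.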

Claim: If $E$ is an equilibrium point different from $E^*$ then $E$ is unstable.
In this case the set $\{E^{pq} : e_p \ne 0 \text{ or } e_q \ne 0\}$ is a basis
for the tangent space.

We want to see that the linear map $D.F.E$ on the tangent space has at
least one positive eigenvalue. Since $E$ is different from $E^*$, there is an
index $p$ satisfying $1 \le p \le k$ and $e_p = 0$. Since $E$ has rank $k$,
there is an index $q$ satisfying $p < q$ and $e_q \ne 0$. Then
$e_q = \sigma_q$. Note that $E^{pq}$ and $E^{qp}$ are in the tangent space. We
have

$$D.F.E.(E^{pq} + E^{qp}) = (b_{pq} - a_{pq})(E^{pq} + E^{qp}) = (\sigma_p - \sigma_q)\sigma_q(E^{pq} + E^{qp}),$$

and $(\sigma_p - \sigma_q)\sigma_q > 0$ since $p < q$.

□
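
The stability classification can be reproduced numerically in the coordinates used in this proof. The sketch below is an illustration, not the author's computation: the function names, the choice $m = 4$, $n = 3$ and the singular values 3, 2, 1 are assumptions. It assembles the matrix of $D.F.E$ on the tangent space in the basis $\{E^{pq} : e_p \ne 0 \text{ or } e_q \ne 0\}$ and prints its largest eigenvalue, which should be negative at $E^*$ and positive at the other rank 2 equilibrium points.

```python
import numpy as np
from itertools import product

def DF(E, A, X):
    """Linearization D.F.E.X = (A-E) X^T E + E X^T (A-E) - X E^T E - E E^T X."""
    R = A - E
    return R @ X.T @ E + E @ X.T @ R - X @ E.T @ E - E @ E.T @ X

def tangent_eigenvalues(e, sigma, m):
    """Eigenvalues of D.F.E on the tangent space at E = Diag(e), for A = Diag(sigma)."""
    n = len(sigma)
    A = np.zeros((m, n))
    A[np.arange(n), np.arange(n)] = sigma
    E = np.zeros((m, n))
    E[np.arange(n), np.arange(n)] = e
    nonzero = {i for i in range(n) if e[i] != 0}
    # Basis of the tangent space: E^{pq} with e_p != 0 or e_q != 0.
    basis = [(p, q) for p, q in product(range(m), range(n))
             if p in nonzero or q in nonzero]
    J = np.zeros((len(basis), len(basis)))
    for col, (p, q) in enumerate(basis):
        X = np.zeros((m, n))
        X[p, q] = 1.0
        image = DF(E, A, X)
        J[:, col] = [image[i, j] for i, j in basis]
    return np.linalg.eigvalsh(J)        # J is symmetric (the map is self-adjoint)

sigma = [3.0, 2.0, 1.0]
largest = {}
for e in ([3.0, 2.0, 0.0], [3.0, 0.0, 1.0], [0.0, 2.0, 1.0]):
    eig = tangent_eigenvalues(e, sigma, m=4)
    largest[tuple(e)] = eig.max()
    print(e, "dimension:", len(eig), "largest eigenvalue:", eig.max())
```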



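Example: We do the case $m = 4$, $n = 3$ to illustrate the calculations which
appear in the proof of the last proposition. We consider
$A := \mathrm{Diag}(\sigma_1, \sigma_2, \sigma_3)$ where
$\sigma_1 > \sigma_2 > \sigma_3 > 0$. Let

$$E := \begin{pmatrix} e_{11} & e_{12} & e_{13} \\ e_{21} & e_{22} & e_{23} \\ e_{31} & e_{32} & e_{33} \\ e_{41} & e_{42} & e_{43} \end{pmatrix}$$

be an equilibrium point. We have that $AA^T$ is the 4 by 4 diagonal matrix
$\mathrm{Diag}(\sigma_1^2, \sigma_2^2, \sigma_3^2, 0)$ and $A^TA$ is the 3 by 3
diagonal matrix $\mathrm{Diag}(\sigma_1^2, \sigma_2^2, \sigma_3^2)$. Hence

$$AA^TE = \begin{pmatrix} \sigma_1^2e_{11} & \sigma_1^2e_{12} & \sigma_1^2e_{13} \\ \sigma_2^2e_{21} & \sigma_2^2e_{22} & \sigma_2^2e_{23} \\ \sigma_3^2e_{31} & \sigma_3^2e_{32} & \sigma_3^2e_{33} \\ 0 & 0 & 0 \end{pmatrix}$$

and

$$EA^TA = \begin{pmatrix} \sigma_1^2e_{11} & \sigma_2^2e_{12} & \sigma_3^2e_{13} \\ \sigma_1^2e_{21} & \sigma_2^2e_{22} & \sigma_3^2e_{23} \\ \sigma_1^2e_{31} & \sigma_2^2e_{32} & \sigma_3^2e_{33} \\ \sigma_1^2e_{41} & \sigma_2^2e_{42} & \sigma_3^2e_{43} \end{pmatrix}.$$

Equating the entries of these two matrices, we see that all the off-diagonal
entries of $E$ must be zero.

We now set $E := \mathrm{Diag}(e_1, e_2, e_3)$. We consider the equilibrium
equation $AE^T = EE^T$. We have
$AE^T = \mathrm{Diag}(\sigma_1e_1, \sigma_2e_2, \sigma_3e_3, 0)$ and
$EE^T = \mathrm{Diag}(e_1^2, e_2^2, e_3^2, 0)$. Equating the entries of these
two matrices we get, for $i = 1, 2, 3$, $\sigma_ie_i = e_i^2$, and hence
$e_i = \sigma_i$ or $e_i = 0$. The specified low rank $k$ will determine the
number of $e_i$ which are zero.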

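We turn to the stability classification of the equilibrium points. We have

$$D.F.E.X = (A - E)X^TE + EX^T(A - E) - XE^TE - EE^TX$$

$$= \begin{pmatrix} (\sigma_1 - e_1)x_{11}e_1 & (\sigma_1 - e_1)x_{21}e_2 & (\sigma_1 - e_1)x_{31}e_3 \\ (\sigma_2 - e_2)x_{12}e_1 & (\sigma_2 - e_2)x_{22}e_2 & (\sigma_2 - e_2)x_{32}e_3 \\ (\sigma_3 - e_3)x_{13}e_1 & (\sigma_3 - e_3)x_{23}e_2 & (\sigma_3 - e_3)x_{33}e_3 \\ 0 & 0 & 0 \end{pmatrix}$$

$$+ \begin{pmatrix} e_1x_{11}(\sigma_1 - e_1) & e_1x_{21}(\sigma_2 - e_2) & e_1x_{31}(\sigma_3 - e_3) \\ e_2x_{12}(\sigma_1 - e_1) & e_2x_{22}(\sigma_2 - e_2) & e_2x_{32}(\sigma_3 - e_3) \\ e_3x_{13}(\sigma_1 - e_1) & e_3x_{23}(\sigma_2 - e_2) & e_3x_{33}(\sigma_3 - e_3) \\ 0 & 0 & 0 \end{pmatrix}$$

$$- \begin{pmatrix} x_{11}e_1^2 & x_{12}e_2^2 & x_{13}e_3^2 \\ x_{21}e_1^2 & x_{22}e_2^2 & x_{23}e_3^2 \\ x_{31}e_1^2 & x_{32}e_2^2 & x_{33}e_3^2 \\ x_{41}e_1^2 & x_{42}e_2^2 & x_{43}e_3^2 \end{pmatrix} - \begin{pmatrix} e_1^2x_{11} & e_1^2x_{12} & e_1^2x_{13} \\ e_2^2x_{21} & e_2^2x_{22} & e_2^2x_{23} \\ e_3^2x_{31} & e_3^2x_{32} & e_3^2x_{33} \\ 0 & 0 & 0 \end{pmatrix}$$

$$= \begin{pmatrix} 0 & b_{12}x_{21} & b_{13}x_{31} \\ b_{21}x_{12} & 0 & b_{23}x_{32} \\ b_{31}x_{13} & b_{32}x_{23} & 0 \\ 0 & 0 & 0 \end{pmatrix} + \begin{pmatrix} c_1x_{11} & -a_{12}x_{12} & -a_{13}x_{13} \\ -a_{21}x_{21} & c_2x_{22} & -a_{23}x_{23} \\ -a_{31}x_{31} & -a_{32}x_{32} & c_3x_{33} \\ -e_1^2x_{41} & -e_2^2x_{42} & -e_3^2x_{43} \end{pmatrix}$$

where (using the same notation as that of the proof)

$$a_{ij} := e_i^2 + e_j^2, \quad b_{ij} := (\sigma_i - e_i)e_j + e_i(\sigma_j - e_j), \quad c_i := 2e_i(\sigma_i - 2e_i).$$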



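We now consider the stability of the equilibrium points when $k := 2$ is
the given rank. There are three cases.

Case: $E := E^* := \mathrm{Diag}(\sigma_1, \sigma_2, 0)$

Note that the tangent space to Orbit($E$) at $E$ consists of matrices
$VE + EW$ where $V$ is in $\mathbb{R}^{4 \times 4}$ and $W$ is in
$\mathbb{R}^{3 \times 3}$. It is easy to see that these matrices have the
following form:

$$X = \begin{pmatrix} x_{11} & x_{12} & x_{13} \\ x_{21} & x_{22} & x_{23} \\ x_{31} & x_{32} & 0 \\ x_{41} & x_{42} & 0 \end{pmatrix}.$$

If $X$ is such a matrix then

$$D.F.E^*.X = \begin{pmatrix} 0 & 0 & \sigma_1\sigma_3x_{31} \\ 0 & 0 & \sigma_2\sigma_3x_{32} \\ \sigma_1\sigma_3x_{13} & \sigma_2\sigma_3x_{23} & 0 \\ 0 & 0 & 0 \end{pmatrix} + \begin{pmatrix} -2\sigma_1^2x_{11} & -(\sigma_1^2 + \sigma_2^2)x_{12} & -\sigma_1^2x_{13} \\ -(\sigma_1^2 + \sigma_2^2)x_{21} & -2\sigma_2^2x_{22} & -\sigma_2^2x_{23} \\ -\sigma_1^2x_{31} & -\sigma_2^2x_{32} & 0 \\ -\sigma_1^2x_{41} & -\sigma_2^2x_{42} & 0 \end{pmatrix}$$

since

$$a_{12} = \sigma_1^2 + \sigma_2^2, \quad a_{13} = \sigma_1^2, \quad a_{23} = \sigma_2^2,$$
$$b_{12} = 0, \quad b_{13} = \sigma_1\sigma_3, \quad b_{23} = \sigma_2\sigma_3,$$
$$c_1 = -2\sigma_1^2, \quad c_2 = -2\sigma_2^2, \quad c_3 = 0.$$

The eigenvalues of $D.F.E^*$ are all strictly negative. In particular, the
eigenvalue-vector pairs are

$$(-2\sigma_1^2, E^{11}), \quad (-(\sigma_1^2 + \sigma_2^2), E^{12}), \quad (-(\sigma_1^2 + \sigma_2^2), E^{21}), \quad (-2\sigma_2^2, E^{22}),$$
$$(-\sigma_1(\sigma_1 - \sigma_3), E^{13} + E^{31}), \quad (-\sigma_1(\sigma_1 + \sigma_3), E^{13} - E^{31}),$$
$$(-\sigma_2(\sigma_2 - \sigma_3), E^{23} + E^{32}), \quad (-\sigma_2(\sigma_2 + \sigma_3), E^{23} - E^{32}),$$
$$(-\sigma_1^2, E^{41}), \quad (-\sigma_2^2, E^{42}).$$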


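Case: $E := \mathrm{Diag}(\sigma_1, 0, \sigma_3)$

Then the tangent space to Orbit($E$) at $E$ consists of matrices having the
following form:

$$X = \begin{pmatrix} x_{11} & x_{12} & x_{13} \\ x_{21} & 0 & x_{23} \\ x_{31} & x_{32} & x_{33} \\ x_{41} & 0 & x_{43} \end{pmatrix}.$$

If $X$ is such a matrix then $D.F.E.X$ is the following matrix:

$$\begin{pmatrix} 0 & \sigma_1\sigma_2x_{21} & 0 \\ \sigma_1\sigma_2x_{12} & 0 & \sigma_2\sigma_3x_{32} \\ 0 & \sigma_2\sigma_3x_{23} & 0 \\ 0 & 0 & 0 \end{pmatrix} + \begin{pmatrix} -2\sigma_1^2x_{11} & -\sigma_1^2x_{12} & -(\sigma_1^2 + \sigma_3^2)x_{13} \\ -\sigma_1^2x_{21} & 0 & -\sigma_3^2x_{23} \\ -(\sigma_1^2 + \sigma_3^2)x_{31} & -\sigma_3^2x_{32} & -2\sigma_3^2x_{33} \\ -\sigma_1^2x_{41} & 0 & -\sigma_3^2x_{43} \end{pmatrix}$$

since

$$a_{12} = \sigma_1^2, \quad a_{13} = \sigma_1^2 + \sigma_3^2, \quad a_{23} = \sigma_3^2,$$
$$b_{12} = \sigma_1\sigma_2, \quad b_{13} = 0, \quad b_{23} = \sigma_2\sigma_3,$$
$$c_1 = -2\sigma_1^2, \quad c_2 = 0, \quad \text{and} \quad c_3 = -2\sigma_3^2.$$

There is a positive eigenvalue. In particular,

$$D.F.E.(E^{23} + E^{32}) = (\sigma_2 - \sigma_3)\sigma_3(E^{23} + E^{32}).$$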
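
Case: $E := \mathrm{Diag}(0, \sigma_2, \sigma_3)$

Then the tangent space to Orbit($E$) at $E$ consists of matrices having the
following form:

$$X = \begin{pmatrix} 0 & x_{12} & x_{13} \\ x_{21} & x_{22} & x_{23} \\ x_{31} & x_{32} & x_{33} \\ 0 & x_{42} & x_{43} \end{pmatrix}.$$

If $X$ is such a matrix then $D.F.E.X$ is

$$\begin{pmatrix} 0 & \sigma_1\sigma_2x_{21} & \sigma_1\sigma_3x_{31} \\ \sigma_1\sigma_2x_{12} & 0 & 0 \\ \sigma_1\sigma_3x_{13} & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & -\sigma_2^2x_{12} & -\sigma_3^2x_{13} \\ -\sigma_2^2x_{21} & -2\sigma_2^2x_{22} & -(\sigma_2^2 + \sigma_3^2)x_{23} \\ -\sigma_3^2x_{31} & -(\sigma_2^2 + \sigma_3^2)x_{32} & -2\sigma_3^2x_{33} \\ 0 & -\sigma_2^2x_{42} & -\sigma_3^2x_{43} \end{pmatrix}$$

since

$$a_{12} = \sigma_2^2, \quad a_{13} = \sigma_3^2, \quad a_{23} = \sigma_2^2 + \sigma_3^2,$$
$$b_{12} = \sigma_1\sigma_2, \quad b_{13} = \sigma_1\sigma_3, \quad b_{23} = 0,$$
$$c_1 = 0, \quad c_2 = -2\sigma_2^2, \quad \text{and} \quad c_3 = -2\sigma_3^2.$$

There is a positive eigenvalue. In particular,

$$D.F.E.(E^{12} + E^{21}) = (\sigma_1 - \sigma_2)\sigma_2(E^{12} + E^{21}).$$

5. Appendix: The Frobenius Inner Product

We use the "Frobenius" (or "euclidean") inner product in the space
$\mathbb{R}^{m \times n}$ of m by n real matrices. For $X$ and $Y$ in this
space, the Frobenius inner product is defined by

$$\langle X, Y \rangle := \mathrm{Trace}(XY^T).$$

In terms of coordinates,
$\langle X, Y \rangle = \sum\{X_{ij}Y_{ij} : i = 1, \ldots, m,\ j = 1, \ldots, n\}$.
Here we review a few of the properties of this inner product.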

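Proposition 12. Adjoints of multiplication maps. Let $B$ and $Z$ be
elements of $\mathbb{R}^{m \times n}$.

• For $X \in \mathbb{R}^{m \times m}$, $\langle XB, Z \rangle = \langle X, ZB^T \rangle$.

• For $Y \in \mathbb{R}^{n \times n}$, $\langle BY, Z \rangle = \langle Y, B^TZ \rangle$.

Proof. We have

$$\mathrm{Trace}(XBZ^T) = \mathrm{Trace}(X(ZB^T)^T)$$

and

$$\mathrm{Trace}(BYZ^T) = \mathrm{Trace}(YZ^TB) = \mathrm{Trace}(Y(B^TZ)^T).$$

□

Proposition 13. Orthogonal invariance. Let $U$ be an m by m real
orthogonal matrix and let $V$ be an n by n real orthogonal matrix. Then, for
all $X$ and $Y$ in $\mathbb{R}^{m \times n}$,

• $\langle UX, UY \rangle = \langle X, Y \rangle$ and

• $\langle XV, YV \rangle = \langle X, Y \rangle$.

Proof. By the result concerning the adjoints of multiplication maps, we have:

• $\langle UX, UY \rangle = \langle X, U^TUY \rangle = \langle X, Y \rangle$ and

• $\langle XV, YV \rangle = \langle X, YVV^T \rangle = \langle X, Y \rangle$.

□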
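
These properties of the Frobenius inner product are easy to confirm numerically. The sketch below is illustrative; the helper name `frob`, the matrix sizes, and the use of QR factorizations of random matrices to produce orthogonal $U$ and $V$ are assumptions made for the example.

```python
import numpy as np

def frob(X, Y):
    """Frobenius inner product <X, Y> = Trace(X Y^T)."""
    return np.trace(X @ Y.T)

rng = np.random.default_rng(7)
m, n = 4, 3
X = rng.standard_normal((m, n))
Y = rng.standard_normal((m, n))

# Orthogonal matrices obtained from QR factorizations of Gaussian matrices.
U, _ = np.linalg.qr(rng.standard_normal((m, m)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))

trace_form = frob(X, Y)
coordinate_form = np.sum(X * Y)          # sum of X_ij * Y_ij over all i, j
left_invariant = frob(U @ X, U @ Y)      # <UX, UY>
right_invariant = frob(X @ V, Y @ V)     # <XV, YV>
print(trace_form, coordinate_form, left_invariant, right_invariant)
```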



LOW RANK APPROXIMATION 






6. References 

• Birkhoff, G. and MacLane, S. (1953) A Survey of Modern Algebra,
Macmillan.

• Chu, M. (1984) The generalized Toda flow, the QR algorithm, and
the centre manifold theory, SIAM J. Alg. Discr. Meth. 5, 187-201.

• Chu, M. (1986a) A differential equation approach to the singular
value decomposition of bidiagonal matrices, Lin. Alg. Appl. 80,
71-80.

• Chu, M. (1986b) A continuous approximation to the generalized
Schur decomposition, Lin. Alg. Appl. 78, 119-132.

• Chu, M. and Driessel, K.R. (1990) The projected gradient method
for least squares approximation with spectral constraints, SIAM J.
Numer. Analysis 27, 1050-1060.

• Deift, P., Nanda, T. and Tomei, C. (1983) Differential equations for
the symmetric eigenvalue problem, SIAM J. Numer. Analysis 20,
1-22.

• Demmel, J.W. (1997) Applied Numerical Linear Algebra, SIAM.

• Driessel, K.R. (2004) On computing canonical forms using flows,
Lin. Alg. Appl. 379, 353-379.

• Driessel, K.R. and Gerisch, A. (2007) Zero-preserving iso-spectral
flows based on parallel sums, Lin. Alg. Appl. 421, 69-84.

• Eckart, C. and Young, G. (1936) The approximation of one matrix
by another of lower rank, Psychometrika 1, 211-218.

• Helmke, U. and Moore, J.B. (1995) Optimization and Dynamical
Systems, Springer.

• Helmke, U. and Shayman, M.A. (1995) Critical points of matrix least
squares distance functions, Lin. Alg. Appl. 215, 1-19.

• Hirsch, M.W. and Smale, S. (1974) Differential Equations, Dynamical
Systems, and Linear Algebra, Academic Press.

• Horn, R.A. and Johnson, C.R. (1985) Matrix Analysis, Cambridge
University Press.

• Nanda, T. (1982) Isospectral flows on band matrices, Doctoral
Dissertation, Courant Institute, New York.

• Nanda, T. (1985) Differential equations and the QR algorithm, SIAM
J. Numer. Analysis 22, 310-321.

• Palis, J., Jr. and de Melo, W. (1982) Geometric Theory of Dynamical
Systems, Springer.

• Stewart, G.W. (1993) On the early history of the singular value
decomposition, SIAM Review 35, 551-566.

• Symes, W.W. (1980a) Systems of Toda type, inverse spectral problems,
and representation theory, Inventiones Mathematicae 59, 13-51.

• Symes, W.W. (1980b) Hamiltonian group actions and integrable systems,
Physica 1D, 339-374.

• Symes, W.W. (1982) The QR algorithm and scattering for the finite
nonperiodic Toda lattice, Physica 4D, 275-280.

• Thorpe, J.A. (1979) Elementary Topics in Differential Geometry,
Springer.

• Watkins, D.S. (1984a) Isospectral flows, SIAM Review 26, 379-392.

• Watkins, D.S. (1984b) The Toda flow and other isospectral flows,
Lin. Alg. Appl. 59, 196-201.

Mathematics Department, Iowa State University 



